Abstract

The concepts of event and anomaly are important building blocks for developing a situational picture of the observed environment.
Here we relate these concepts to the JDL fusion model and demonstrate the power of Markov Logic Networks (MLNs)
for encoding uncertain knowledge and computing inferences according to observed evidence. MLNs combine the expressive power
of first-order logic and the probabilistic uncertainty management of Markov networks. Within this framework, different types of
knowledge (e.g. a priori, contextual) with associated uncertainty can be fused together for situation assessment by expressing
unobservable complex events as a logical combination of simpler evidence. We also develop a mechanism to evaluate the level of
completion of complex events and show how, along with event probability, it could provide additional useful information to the
operator. Examples are demonstrated on two maritime scenarios of rules for event and anomaly detection.

Keywords: Context-based fusion, Situational Awareness, Uncertainty management, Markov Logic Networks



1. Introduction 

State-of-the-art situation assessment (SA) systems (e.g. an
automatic surveillance system [1]) are able to deal with vast
amounts of data and information, often of a heterogeneous kind.
Their goal is to provide a constantly updated situational picture 
about the observed environment or set of entities to an opera- 
tor in order to facilitate human decision making. Updating the 
current system representation of the situation is generally per- 
formed by acquiring, through sensors or other sources of infor- 
mation, new observations which provide a possibly incomplete 
and uncertain view. 

Currently, low-level sensory data is the main source of infor- 
mation used to understand the observed evolving scenario and 
to identify anomalous conditions; in particular, up to now mar- 
itime surveillance heavily relies on the Automatic Identification 
System (AIS), coastal radars, space-based imagery, and other 
sensors, to form a picture in which the operator can recognize 
complex patterns and make decisions [2, 3].

Anomaly detectors or event recognition systems for maritime
situational awareness are presented in [4, 2, 5, 6, 7, 8, 9, 10].
The common thread that unites these works is the definition
of an expert system that aims at detecting a set of anomalous
behaviours or potential threats. Subject matter experts define
a knowledge base (KB), which comprises the possible abnormal
patterns the target could follow; then, on top of it, a
reasoning engine queries the occurrence of an anomaly for a
target object at an arbitrary time instant. For example, in [2]
AIS data is used for extracting statistical behaviours of motion
patterns, while in [5] situational awareness is achieved by fusing
knowledge-based detection with data-driven anomaly
detection. In [4] a comprehensive literature survey of the anomaly
detection process via data analysis is presented; definitions of
anomaly and normalcy, explored in the light of decision
making systems, are given in order to support the analytical
reasoning process.

Figure 1: Illustration of an ideal maritime situational awareness situation. The
sensory data for an object of interest must be coupled with high-level contextual
information.

The main goal of a reasoning engine or probabilistic inference
system is to associate a posterior probability distribution to
a set of queries [11], given observed evidence. The incorporation
of abductive/inductive and deductive inferencing processes
is a vital element in an automatic fusion system, and it represents
a fundamental step for situational awareness. How this
involvement can be obtained, on both theoretical and applicative
levels, is a crucial point, and is the subject of ongoing research
[12].

The reasoner is usually fired by low-level observations pro- 
vided by sensors, covering in this way the majority of abnor- 
mal situations in the domain; however, it is interesting to no- 
tice how anomalous behaviours do not always follow standard 
trends or well-known patterns, especially if related solely to 
vessel movements, but sometimes they take the form of seemingly
unrelated activities on a larger scale [13]. Ship-centric
focus should be replaced by a broader vision, where the ideal 
situational awareness system should then be flexible and adap- 
tive enough to integrate both low-level and high-level informa- 
tion, detecting anomalous or suspicious conditions by reason- 
ing on manifest or uncertain data, but also on (apparently irrel- 
evant) relations among objects, which may reveal unobserved 
coincidences. The maritime domain is a daunting scenario for
testing such systems because of many factors: its challenging
nature, where the coverage of wide areas is given by discontinuous
and intermittent sensory data; its well-known commercial
policies and practices, which can suggest normalcy behaviour
patterns; the presence of local contextual information, stable in
time, which can depict alternative indicators of multi-layered
situations; and the urgency for systems capable of providing
effective and advanced warning to promote countermeasures to
illicit activities.

The integration of contextual knowledge, as demonstrated in 
[14, 15, 16] where it is exploited for improving tracking ac- 
curacy, can greatly enhance the performance of an awareness 
system. Despite its value, the representation and use of context 
is often poorly integrated, if not absent, even if the richness and 
completeness of this information is extremely useful to prop- 
erly interpret the available stream of raw sensor data from a 
multitude of points of view (security, safety, economic or
environmental situation, etc.). Qualitative high-level knowledge
can help to infer hidden states from low-level data generated
by sensors, other fusion processes or human reports. In
other words, context is a powerful means to picture a broader 
and deeper operational situation, as it can reduce uncertainties 
where normally analysts would need to be consulted. 

In this paper, we exploit MLNs to encode uncertain knowl- 
edge, fuse data coming from multiple (and possibly heteroge- 
neous) sources, and perform reasoning on incomplete data. One 
key point of using the MLNs for reasoning is their ability to 
reason with incomplete or missing evidence, which is a cru- 
cial feature hardly found in other approaches, but sought after 
especially in the maritime domain, where the data is often
inaccurate, delayed or simply not available. Another advantage
with respect to other systems is that MLNs tolerate
inconsistencies or contradictions in the knowledge base, which
commonly arise when different experts contribute to it.
This avoids the non-trivial knowledge engineering otherwise
required to guarantee rule consistency. Here we use
Markov Logic Networks (MLNs) to detect two possible anomalous
conditions in the maritime domain: a rendezvous at sea and a
hazardous combination of cargo ships in a harbour.

We use exemplary scenarios, the first one derived from ex- 
perts’ suggestions gathered at the NATO STO Centre for Mar- 
itime Research and Experimentation and the second one ex- 
panded from [17], to highlight how unobserved complex events 
could be built by logical combination of simpler evidence, and 
how contextual information is extremely valuable in many
conditions. MLNs present advantages suited to our domain: they
support reasoning with missing or partial observations (incomplete
evidence), they allow expert rules and relational
knowledge to be encoded with an associated degree of uncertainty,
and they are able to handle contradictions and inconsistencies [18].

Preliminary investigation on MLNs in maritime domain has 
been initiated in [19], where we leveraged the expressive power 
of first-order logic (FOL) and the probabilistic uncertainty man- 
agement of Markov networks in order to detect anomalies via 
reasoning on uncertain knowledge. Here we aim to expand and 
refine that work by providing contributions for: 

• clarifying the concepts of event (simple and complex) and 
anomaly in the scope of fusion terminology; 

• explicitly explaining how simple and complex events can 
be encoded in the form of FOL formulas with associated 
degree of uncertainty in maritime domain; 

• demonstrating how MLNs could provide a powerful tool 
for fusing heterogeneous sources (e.g. a priori, contextual, 
sensory, etc.) of information for situation assessment by 
being able to express unobserved complex events by logical
combination of simpler evidence;

• developing a mechanism to evaluate the level of comple- 
tion of complex events as this calculation is not directly 
solvable within the MLNs framework. 

1.1. Terminology 

To facilitate human decision making, an updated situational 
picture of the observed environment assessing the current state 
of domain entities and their relationships is required. Events 
and anomalies can be considered fundamental building blocks 
for developing such a picture of the environment. In this sec- 
tion, we provide the necessary definitions of these concepts and 
relate them to the JDL fusion model [12]. In the following, the
term level will be used as per JDL terminology. 

While there are many papers in the literature that deal with 
events and provide various definitions [20], we here break
down the main concepts in light of the typical functionality and
requirements of an SA system. An event modelling framework in
maritime domain was recently presented in [21], where a piracy 
example is presented with the intent of facilitating the decision 
making process, but no reasoner is associated to the graphical 
representation of events. 

For our purposes an event is a “significant occurrence or happening”.
It can be subdivided into simple, when we consider the
variation of a quantity or state, or complex, which is a combination
of atomic or complex activities [20]. Figure 2 gives an
intuitive representation of the idea.

Table 1: Examples of events and anomalies at different JDL levels

JDL Level  Event                                                         Type     Anomaly
0          Absence of AIS signal                                         simple   AIS off
1          Vessel increased speed                                        simple   Vessel speed over limit
2          Vessel X stopped. Vessel Y stopped. Vessel X and Y are close  complex  Vessel X and Y are having a rendezvous

Figure 2: Example of the detection of a complex event by observing the
occurrence of its components. $C_2$ is composed of complex event $C_1$ and simple
event $S_3$. All the events in this example are non-instantaneous, as each of them
spans a certain interval of time.

A simple event is any significant variation of input data, at
any level, discernible by the system. Such events are also called
atomic in the literature; we use the term simple here to avoid
confusion with ground atoms, defined in Section 2. Simple events can
be directly observable or not, and can be either instantaneous or
last for an arbitrarily long period of time. As the name implies,
this is the most basic type of occurrence and cannot be further
decomposed into simpler constituent events.

More generally, variations of input signals (Level 0), of a
target’s state (e.g. speed, direction, etc. that can be included in 
Level 1), of a target’s relation with other entities (Level 2), are 
all examples of simple events. 

Complex events are a combination of two or more component
events (simple or complex) that can be arbitrarily combined
through logical operators ($\wedge$, $\vee$, $\neg$) to encode articulated
expert and domain knowledge. Complex events can be either
triggered by a specific time-ordered sequence of component
events, or be just an unordered collection of them. In addition,
a complex event can be composed of a heterogeneous combination
of events generated by data at different levels. Generally,
complex events span a certain interval of time. They could
have a fixed time-frame, that is, constituent events have to occur
within a given time window, or not, in which case they wait
indefinitely for all the component events to happen.

Simple and complex events will be represented in Section 2 
by predicates, with the difference being that a complex event 
will appear only as the consequent of an implication (i.e. on the 
right side of an implication) as it cannot be directly observable 
but only inferable from the detection of its components. 
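
As an illustration of this convention, the Level 2 row of Table 1 could be written as an implication whose consequent is the complex event. This is only a sketch: the predicate names stopped, close and rendezvous are hypothetical and are not taken from the rule sets used later in the paper.

```latex
\mathit{stopped}(x) \wedge \mathit{stopped}(y) \wedge \mathit{close}(x, y)
    \Rightarrow \mathit{rendezvous}(x, y)
```

The three antecedent predicates correspond to simple events that can be observed, while rendezvous appears only on the right side and has to be inferred.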

An anomaly can be considered a critical event to which the
system is generally called to react. Usually, a threshold
establishes whether input data can be considered unexpected or
anomalous, thus raising an exception. Thresholds, provided by
domain experts or learned automatically by the system from data,
are therefore used to immediately spot an anomalous condition.
However, anomalies provide no notion whatsoever of the
meaning of the exceptional input. Following this definition, an
anomalous event is an occurrence of some type that deviates
from expected values or behaviour. In a Situation Assessment
system, the knowledge base is consulted to infer a possible
conclusion from the anomalous condition.

Figure 3: Implicit knowledge modelling. Exemplification of ship trajectories
being clustered [23] to model normal movement patterns. A trajectory leaving
known clusters would be flagged as anomalous.

An exhaustive taxonomy of anomalies in the maritime
domain can be found in [22], where the author provides a
clear and comprehensive classification of possible events of in- 
terest, grouping them by kinematic and non-kinematic patterns 
and providing an ontology of possible anomaly causes. 

Events and anomalies can be defined by explicit or implicit 
models in the system. In the former case the model encodes, 
usually exploiting expert and contextual knowledge, the complete
description of what an (anomalous) event is. In contrast,
implicit modelling means that samples of activities are
learned by the system in an unsupervised manner in order to detect
a deviation from common patterns. Trajectory clustering (as shown in
Figure 3) is an example of a technique for extracting and maintaining
(low-level) knowledge about normal movement patterns
[23].

The notions of “high (low)-level event” and “high (low)-level anomaly”
have, to the best of our knowledge, never been properly
formalized. Following the description given in the above sections
and taking into account JDL levels, our position here is that
whenever the system detects any appreciable variation of input
data of any level, a corresponding event is generated. Table 1
shows some examples of events generated from data at different
levels. For instance, the detection of presence or absence of
AIS signal is something that can be considered at the bottom of
the JDL hierarchy, while the speed of a vessel is a feature of its
state and belongs to Level 1. Two stopped vessels very close
to each other out at sea constitute a relation between two entities
and help define the current situation (JDL Level 2).

It is not true then, that, to flag a situation as anomalous, data 
and information have to bubble up through the levels following 
increasing processing and refinement steps. Anomalies can be 
generated from data of every kind and level as shown in Table 
1. For example, the absence of AIS signal can be directly con- 
sidered something anomalous, as well as a speeding boat or a 
rendezvous out at sea. Also, anomalies could be generated both 
from simple and complex events as Table 1 exemplifies. 

1.2. Related work on reasoning systems 

Expert systems, also known as rule-based systems, are often 
used to represent high-level contextual information and to de- 
scribe the events to be detected [9]. Simple if-then-else rules 
have the advantage of being easy to code and extremely effective
at flagging a suspicious event or anomaly. Unfortunately, in most
cases the reasoning engine is not refined enough, being
simply the result of a binary process that may lead to drastic
decisions with no degrees of uncertainty. In a step forward,
uncertainty can be coupled to the rules in the knowledge base 
as in the case of MYCIN [24], or can be interpreted as prob- 
abilities when Bayes’ rule is used as the basis of inference, as 
in Prospector [25]. A major drawback of these rule-based
systems is that they act as a monolithic chain that triggers the
rules only when complete evidence is available. A
more natural behaviour would be to infer (by abduction or deduction),
with different degrees of information quality or reliability [26],
the missing pieces of information from a priori knowledge to
draw a general picture even in the absence of direct observations.

An improvement is provided by Description Logics, which 
represent a formalization of Semantic Networks as an exten- 
sion of classical logic and are very useful for intuitively repre- 
senting knowledge. Although sound, they do not offer support 
for uncertainty, even if the structure graph enables the support 
of multiple hypotheses. Probabilistic and fuzzy description logics
stem from the need to support uncertainty and vagueness when
reasoning in ontologies; they extend classical DLs to deal with
numerical probabilities or fuzzy truth values [27].

Dealing with uncertainty is one of the most desirable char- 
acteristics for a fusion system [28], as uncertain data affects 
decisions and the quality of estimates. Uncertainty is defined as 
the lack of exact knowledge, which would allow us to formu- 
late a reliable conclusion. Uncertainty is generated when logic 
fails, according to Russell and Norvig [11], because laziness or 
theoretical or practical ignorance are introduced in the modelling
of the problem. In particular, laziness refers to the will
to model the domain with fewer rules than necessary, while
ignorance occurs when the theory is lacking in some respect or
part of the data is missing. Probability theory provides a way 
to overcome and represent the uncertainty that derives from ig- 
norance; on this side, Bayesian networks provide a tractable 
solution. They are already extensively used in the surveillance domain (see
[29] for a recent survey); in [6] they have been used for assessing
the threat probability obtained by the combination of five
types of anomalies or abnormal behaviours, namely deviation
from standard routes, unexpected AIS activity, unexpected port
arrival, close approach and zone entry. Despite being so widely
used for probabilistic representations of uncertain knowledge,
Bayesian networks have strong limitations, including the fact
that they allow reasoning only about a fixed number of attributes,
as their nature is essentially propositional: the set of
random variables is fixed and finite, and each has a limited
domain [11]. As a result, their application to complex problems is
often impeded, as they require defining in advance, with confidence,
how many entities will be involved and what type of
relationships hold among them. Even in Hidden Markov
Models, which have been used for event recognition in the case of
temporally and spatially distributed observations [30, 31], the
number and type of states must be specified in advance. This
last condition largely impacts performance when scaling up
to a larger scenario, reducing their applicability in a flexible
and uncertain domain such as the maritime one.

Ontologies are another popular means to encode knowledge 
and represent relationships among entities [31, 32, 7, 33]. In
[10], a rule-based system is coupled with ontologies for auto- 
matic discovery of anomalies in maritime domains. An attempt 
to integrate taxonomical knowledge through OWL ontologies 
and rule-based knowledge through SWRL rules is described in 
[20] for video surveillance. While the paper presents an action- 
able solution by exploiting freely available software tools and 
libraries, on one hand it lacks the full expressiveness of FOL as 
SWRL provides only a limited support for zeroth-order logic. 
On the other hand, it does not handle uncertainty in a principled
way other than through an extension for fuzzy reasoning [34]. The
same issues are present in [32] and [7]. In the former, a hybrid
approach is presented to fuse ontology-based context represen- 
tation, and deductive and abductive reasoning for detection un- 
der uncertainty of abnormal objects from their characteristics 
and behaviour. In the latter work, PR-OWL is coupled with a 
Multi-Entity Bayesian Network for vessel of interest identifica- 
tion. 

A much more powerful tool is FOL, which, with respect to
propositional logic, is expressive enough to represent complex
environments in a concise way. A difference lies in the
ontological commitment [11], a concept that involves
reality and its representation by means of a model: propositional
logic assumes that a set of mutually exclusive (true or
false) facts hold or not in a certain world, linking symbols to
values, while FOL considers that objects and their relations
(predicates) do or do not hold, specifying much richer environment
semantics. FOL is only semi-decidable, but this problem
is generally mitigated by attempting to convert the KB into
Horn clausal form, which is commonly used by inference
engines [18].

Other possible decision theory tools include Dempster-Shafer
theory (DS), from which the Dezert-Smarandache formalization
has been derived [35], and fuzzy logic. Dempster-Shafer
approaches, also known as evidence theory or theory
of belief functions, allow evidence at different levels of
abstraction to be represented, with the possibility of distinguishing
between uncertainty and ignorance. In this respect, it
is more flexible than classical Bayesian theory when dealing 
with incomplete knowledge and has explicit mechanisms for 
decision support. However, Dempster-Shafer theory cannot be 
used directly to encode expert knowledge in terms of complex 
sequences of events. 

As for fuzzy logic, in the maritime domain Balmat
et al. applied this technique to perform risk assessment in
a ship-centric system [36]. Fuzzy logic has been successfully
used for event detection and recognition in the past [37], as it
makes it possible to take into account insufficient information,
thus dealing with imprecise data, and the evolution of available
knowledge.

Statistical Relational Learning (SRL) is an emerging research 
area that aims to represent, reason and learn in domains with 
complex relational and rich probabilistic structure [38]. SRL
techniques have a very strong potential of application to SA 
systems for their ability to model dependencies between related 
instances (e.g. any combination of relations among observed 
entities or between a set of targets and the environment). 

Markov Logic Networks are a recent SRL technique that attempts
to unify the world of logic and probability [18]. MLNs
are able to encode expressive domain knowledge through FOL 
formulas, and handle typically uncertain sensory data in a prob- 
abilistic framework that takes into account relations and depen- 
dencies through a graphical model (Markov Networks). MLNs 
have been applied to video surveillance systems for event de- 
tection in [39], where they are shown as a powerful fusion tool 
to combine observations coming from multiple heterogeneous 
sources of information. 

Described in Section 2, MLNs will be here applied to the 
maritime domain where the dynamics and relations of many 
targets have to be captured through a multiplicity and hetero- 
geneity of sensors and sources of information (including a pri- 
ori knowledge, contextual information, and human-generated 
reports) in a vast and complex scenario. The overall goal is 
to provide a comprehensive situational picture to the operator.
In our case, uncertainty is generated by the intrinsic ambiguity
of subjective opinions of experts (soft data [40]), as no
systematic and universal method exists to perfectly formalize 
the scenario: conflicts, approximations, imprecisions and con- 
tradictions can generate inexact, incomplete or unmeasurable 
information. Thus, attaching a weight to each formula in the 
knowledge base is a way to recognize the quality of its source 
and incorporate it into the reasoning process. 



2. Markov Logic Networks

We here provide essential background notions of Markov
Logic Networks, but the reader is advised to refer to [18] for
further details. MLNs are a powerful tool for combining logical
and probabilistic reasoning. While a knowledge base of
logic formulas is satisfiable only by those worlds (truth values
of atomic formulas) in which it is true, an MLN relaxes this hard
constraint by associating a probability value to the worlds that
do not fully satisfy the KB. Therefore, the fewer formulas a
given world violates the more probable it is.

An MLN is then a set $L$ of pairs $(F_i, w_i)$ where $F_i$ is a FOL
formula and $w_i$ its corresponding real-valued weight. The set
of all formulas $F_i$ in $L$ constitutes the KB while the weight $w_i$
associated to each $F_i$ reflects how strongly the constraint
imposed by the formula is to be respected. This impacts directly
the probability assignment: worlds which satisfy a high weight
formula are going to be much more probable than those that do
not.

A Markov Logic Network $L$ together with a finite set of
constants $C$ defines a Markov network $M_{L,C}$ that models the
joint distribution of the set of random (binary) variables
$X = (X_1, X_2, \ldots, X_n) \in \mathcal{X}$. Each variable of $X$ is a ground atom
(predicate whose arguments contain no variables) and $\mathcal{X}$ is the
set of all possible worlds, that is the set of all possible truth
value assignments of $n$ binary variables. The network is built as
follows:

• $M_{L,C}$ contains one (binary) node for each possible ground
atom given $L$ and $C$.

• An edge between two nodes indicates that the corresponding
ground atoms appear together in at least one grounding
of one formula in $L$. Ground atoms belonging to the same
formula are connected to each other thus forming cliques.

• A feature $f_j$ is associated with each possible grounding of
a formula $F_i$ in $L$. Each $f_j$ assumes value 1 if the corresponding
ground formula is true and 0 otherwise.

The probability distribution over $X$ taking values $x \in \mathcal{X}$
specified by $M_{L,C}$ is given by:

$$P(X = x) = \frac{1}{Z} \exp\left( \sum_{i=1}^{|L|} w_i \, n_i(x) \right) \qquad (1)$$

where $|L|$ indicates the cardinality of $L$, thus counting the number
of formulas of the knowledge base, and $n_i(x)$ is the number
of true groundings of $F_i$ in the world $x$.

$$Z = \sum_{x' \in \mathcal{X}} \exp\left( \sum_{i=1}^{|L|} w_i \, n_i(x') \right) \qquad (2)$$

is a normalizing factor often called the partition function.

Given the joint distribution function in (1), it is possible to
calculate the probability that a given formula $F_i$ holds given the
Markov network $M_{L,C}$ as follows:

$$P(F_i \mid M_{L,C}) = \sum_{x \in \mathcal{X}_{F_i}} P(X = x \mid M_{L,C}) = \frac{1}{Z} \sum_{x \in \mathcal{X}_{F_i}} \exp\left( \sum_{j=1}^{|L|} w_j \, n_j(x) \right) \qquad (3)$$

where $\mathcal{X}_{F_i}$ is the set of worlds where $F_i$ holds.
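
Equations (1) to (3) can be evaluated by brute force on a toy network. The following is a minimal sketch under stated assumptions, not the inference procedure used in this work: the predicates `stopped` and `suspicious`, the single rule, the constant `V1` and the weight 1.5 are all invented for illustration, and exhaustive enumeration is feasible only because the example has four possible worlds.

```python
import itertools
import math

# Hypothetical toy domain: one vessel constant and two unary predicates.
constants = ["V1"]
predicates = ["stopped", "suspicious"]

# Ground atoms: one binary node per (predicate, constant) pair.
atoms = [(p, c) for p in predicates for c in constants]


def rule_true_groundings(world):
    """n_i(x): number of true groundings of stopped(v) => suspicious(v)."""
    return sum(
        1
        for c in constants
        if (not world[("stopped", c)]) or world[("suspicious", c)]
    )


# The MLN L as a list of (n_i, w_i) pairs; the weight is an assumption.
mln = [(rule_true_groundings, 1.5)]


def all_worlds():
    """Enumerate every truth assignment to the ground atoms."""
    for values in itertools.product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))


def unnormalised(world):
    """exp(sum_i w_i * n_i(x)), the numerator of Eq. (1)."""
    return math.exp(sum(w * n(world) for n, w in mln))


def partition_function():
    """Z of Eq. (2): sum of the unnormalised scores over all worlds."""
    return sum(unnormalised(x) for x in all_worlds())


def world_probability(world):
    """P(X = x) of Eq. (1)."""
    return unnormalised(world) / partition_function()


def formula_probability(holds):
    """P(F | M_{L,C}) of Eq. (3): sum over the worlds where `holds` is true."""
    z = partition_function()
    return sum(unnormalised(x) for x in all_worlds() if holds(x)) / z


# Probability that the rule itself holds for V1.
p_rule = formula_probability(lambda x: rule_true_groundings(x) == 1)
```

Three of the four worlds satisfy the rule, so `p_rule` equals 3e^1.5 / (3e^1.5 + 1); the only violating world (stopped but not suspicious) keeps a non-zero probability, which is the relaxation of the hard constraint described above.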

While (1) provides the probability of configuration $x$ of truth
values for the ground atoms in the Markov network, (3) can be
used instead to evaluate the probability that a formula $F_i$ (e.g.
a predicate representing an event) holds given $M_{L,C}$, where $C$ is
composed of observed entities and other constants provided by
contextual knowledge. This gives a glimpse of the power of the
framework: an arbitrary formula that can be grounded in $M_{L,C}$
can be queried to get its probability of being true. Thus not
only the formulas in $L$ but also any logical combination of them
that can be grounded in the Markov network can be queried as
well. This is extremely important for an SA system where the
operator might want to evaluate the truth degree of a new (complex)
event or condition as the combination of existing evidence
in the KB. The framework can also provide the probability that
a formula $F_2$ holds given that formula $F_1$ does, or provide an
answer to whether the KB entails a given formula.

According to the definitions given in Section 1.1, an MLN 
provides an explicit way of encoding knowledge. However, 
both rule weights and the rules themselves can be learned
from data. These capabilities make MLNs a powerful tool that 
combines the benefits of both implicit and explicit modelling. 

One point that should be highlighted is that the Markov network
of ground atoms is composed, as already mentioned, of
a set of binary random variables, each of which constitutes a
node in the graph. In our application, the grounding of a
predicate can happen due to a priori or contextual knowledge
(e.g. $\mathit{harbour}(H_1)$ where $H_1$ is a known location such as “La
Spezia”) or accrued sensory evidence (e.g. $\mathit{cargo}(V_1)$ where $V_1$
corresponds to an observed vessel). An example of a Markov
network is shown in Figure 4. The truth value of grounded
predicates can be provided by external knowledge and observations
or inferred by the reasoning engine. For example, we
might know that $H_1$ is a harbour while $H_2$ is not. This would allow
us to say that $\mathit{harbour}(H_1)$ is true and that $\mathit{harbour}(H_2)$ is false,
thus providing binary input to the system. However, one would
like to add an additional level of uncertainty, that is observation
uncertainty, to encode the naturally imprecise or uncertain nature
of data coming from sensors and human observers. For example,
the ground predicate $\mathit{proximity}(V_1, V_2)$ asserts the vicinity
of vessels $V_1$ and $V_2$ according to a certain threshold. The
measured distance could be affected by error, thus leaving a margin
of uncertainty on the truth value of the predicate. To encode
such observation uncertainty, a possible solution, proposed in
[39], involves adding a single-atom rule to the rule base with an
associated weight proportional to the detection probability of
the observed evidence. Observation uncertainty will not be employed
in the experiments of Section 4 and is instead left to
further investigation and future work.
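
To give an intuition of how a single-atom rule shapes the probability of an observation, the sketch below uses a log-odds mapping between detection probability and weight. This mapping is an assumption made here for illustration and is not necessarily the one adopted in [39]; it relies only on Eq. (1) applied to a ground atom that appears in no other formula, and the function names are hypothetical.

```python
import math


def unit_clause_weight(detection_probability: float) -> float:
    """Log-odds weight w such that an isolated atom has P(atom) = p (assumed mapping)."""
    if not 0.0 < detection_probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return math.log(detection_probability / (1.0 - detection_probability))


def isolated_atom_probability(weight: float) -> float:
    """P(atom) from Eq. (1) when the atom appears only in its own unit clause.

    The world where the atom is true scores exp(w); the other scores exp(0) = 1.
    """
    return math.exp(weight) / (math.exp(weight) + 1.0)


# Example: a proximity detection assumed to be reliable 90% of the time.
w = unit_clause_weight(0.9)
p = isolated_atom_probability(w)
```

Under this assumption a weight of zero leaves the atom at probability 0.5, and larger detection probabilities translate into larger positive weights. Once the atom also takes part in other formulas, its marginal probability is no longer determined by this weight alone.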

3. Completion of complex events 

As described in Section 1.1, a complex event is the logical 
combination of two or more events and therefore cannot be di- 
rectly observed but has to be inferred by observing the occur- 
rence of component events. As suggested in Section 1.1, it is 
often the case for SA systems that anomalous or critical situa- 
tions have to be deduced from a number of indicators derived 
from apparently normal behaviour of observed entities. Indicators
generally take the form of simple events, but this is not
a rule, and a complex event can be composed of an arbitrary
combination of simple and other complex events.

In order to prepare a timely response, adopt adequate 
counter-measures, or simply plan in advance future actions, it 
would be of course useful to detect a critical condition when it 
is about to happen. It would be interesting then to detect complex
events that are “almost completed” or, in our logic setting,
“almost true”. In other words, it would be useful to provide
the operator with a continuously updated indication of how complex
events are building up. Take for example the rule

$$\mathit{cargo}(v) \wedge \mathit{isHeadingTo}(v, h) \wedge \mathit{harbour}(h) \wedge \mathit{risk}(h, \text{“high”}) \Rightarrow \mathit{alarm}(v)$$

where the unary predicate alarm marks the occurrence of a
complex (critical) event involving vessel v. Since the
antecedent is the conjunction of four predicates (events), it
evaluates to true only when all four predicates do. It would be
interesting to know to what extent the current world satisfies
the implication. For example, if the first, third and fourth
predicates of the antecedent were true, then we would say that
the rule is at 75% of its completion and could potentially
trigger if the remaining predicate also evaluated to true in the
future. A condition like this could attract the attention of the
operator or of the system itself, which might decide to conduct
further investigation on vessel v by directing sensing
capabilities or information providers to acquire additional data
(see JDL level 4 [12]).

The formulation of a priori (e.g. contextual) knowledge and
observed evidence within the MLN framework allows the evaluation
of complex events by querying the truth value of the associated
(consequent) predicate. In the example above, the complex event
encoding a critical condition is represented by the alarm
predicate on the right-hand side of the implication. However,
the probability of such predicates does not necessarily reflect
their completion condition. While it could work nearly as
expected for double implications (⇔), in the case of
implications, as in the example above, the antecedent is only a
sufficient condition for the consequent to be true, not a
necessary one. Therefore, when the antecedent is false, it says
nothing about the truth value of the consequent. In addition,
the consequent could be involved in other formulas and be
subject to their effects.

Evaluating the probability of the antecedent being true does
not provide a solution either, as this probability does not
match the completion concept. If, for example, the weight of the
formula expresses a hard constraint, then regular FOL holds and
a single false predicate brings the truth value of a conjunction
to zero (as in the example above).

Therefore, to evaluate the level of completion of complex 
events we use the following procedure for each formula F in 
the KB representing a complex event: 

1. Convert F into Conjunctive Normal Form¹ (CNF), where the
grounding of variables is done according to observations
and contextual knowledge;

2. Evaluate the truth value of ground predicates according to
available evidence. In the case of atoms whose truth value
is unknown, the Closed World Assumption² (CWA) holds;

3. Let $E_i$ be FOL literals representing simple events, then
recursively compute the completion of F by using the
following two rules:

• if $(E_1 \wedge \dots \wedge E_K)$ is a conjunction of $K$ events (predicates) then
$\mathrm{completion}(E_1 \wedge \dots \wedge E_K) = \frac{1}{K} \sum_{k=1}^{K} P(E_k)$

• if $(E_1 \vee \dots \vee E_N)$ is a disjunction of $N$ events then
$\mathrm{completion}(E_1 \vee \dots \vee E_N) = \max_{n} P(E_n)$

¹ Conjunction of disjunctions of literals.

Figure 4: Example of Markov network. The network is obtained by grounding formulas #18 and #19 in Table 3 with constants V1 and V2 referring to two observed
vessels.

In other words, the completion of a conjunction of events is
given by the average of their probabilities. If the events are
of complex type, they must be recursively evaluated by applying
the same rules. The probability of each atom is obtained by
querying the Markov network as per (3). The probability of
literals whose truth value is unknown is set to zero, and the
same is done for predicates that have not yet been grounded.
Please note that each atom could be either a positive or a
negative literal and that the probability being evaluated for
each atom is the probability of it being true.

The completion of a disjunction of events is instead obtained
by taking the maximum probability value among the events in
the disjunction. Completion calculation is further exemplified
in Table 2, where the first two rows show again the calculation
for conjunctions and disjunctions, and the third row exemplifies
the case of a complex event whose event tree is shown at the
bottom left of the table.
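The two completion rules above can be sketched in a few lines of code. This is a minimal illustration and not the authors' implementation: the tuple-based encoding of event trees, the function name `completion` and the dictionary of atom probabilities are assumptions made for the example. Atoms missing from the dictionary receive probability zero, in line with the CWA.

```python
from typing import Dict, Tuple, Union

# An event is either an atom name (str) or a tuple ("and" | "or", child, child, ...).
Event = Union[str, Tuple]


def completion(event: Event, prob: Dict[str, float]) -> float:
    """Recursively compute the completion level of an event tree."""
    if isinstance(event, str):
        # Simple event: probability of the atom being true.
        # Atoms without any support default to 0.0 (Closed World Assumption).
        return prob.get(event, 0.0)
    op, *children = event
    values = [completion(child, prob) for child in children]
    if op == "and":
        return sum(values) / len(values)  # average over the conjuncts
    if op == "or":
        return max(values)                # best-supported disjunct
    raise ValueError(f"unknown operator: {op!r}")


# Antecedent of the example rule; isHeadingTo is not (yet) supported by evidence.
antecedent = ("and", "cargo(V)", "isHeadingTo(V,H)", "harbour(H)", "risk(H,high)")
evidence = {"cargo(V)": 1.0, "harbour(H)": 1.0, "risk(H,high)": 1.0}
print(completion(antecedent, evidence))  # 0.75

# Third row of Table 2: (E1 and E2) or E3, with hypothetical probabilities.
mixed = ("or", ("and", "E1", "E2"), "E3")
print(completion(mixed, {"E1": 0.9, "E2": 0.5, "E3": 0.6}))
```

The first call reproduces the 75% figure discussed for the example rule: three of the four conjuncts are true, so the average is 3/4.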

The application of the CWA sets to zero the probability of 
any predicate that has no supporting evidence for being true. 
This means that before invoking the CWA both external evi- 
dence (i.e. actual observations) and (possible) inference effects 
should be checked for ground atoms with unknown truth value 
every time their completion level is to be evaluated. 

² According to the Closed World Assumption, atoms with unknown truth
value are considered false [18].

Table 2: Completion of complex events. With $E_i$ being FOL literals representing
simple events, the first two rows show how the completion is calculated in
the case of conjunction (∧) and disjunction (∨) of component simple events,
respectively. The third row shows an example of a complex event
whose event tree comprises a combination of both logical operators.

Event tree                                      Logic formula                         Completion
∧ node over E_1, ..., E_K                       $(E_1 \wedge \dots \wedge E_K)$       $\frac{1}{K} \sum_{k=1}^{K} P(E_k)$
∨ node over E_1, ..., E_N                       $(E_1 \vee \dots \vee E_N)$           $\max_{n} P(E_n)$, $n = \{1, \dots, N\}$
∨ node over (∧ node over E_1, E_2) and E_3      $(E_1 \wedge E_2) \vee E_3$           $\max\left(\frac{P(E_1) + P(E_2)}{2}, P(E_3)\right)$

Since the completion of events is calculated only after the
probability values of the predicates have been evaluated through
the MLN framework, it does not affect inference and is used only
as an indicator for the operator or the system. The choice of
the CWA is a prudent one, as it sets to zero the probability of
unobserved events being true. The open world case could be dealt
with in practice in several ways. One example could be to set
the probability of unknown predicates to 0.5. A better option
could be to compute the completion as in the CWA case
(probability of unknown literals set to zero) but to explicitly
highlight to the operator the number of false and unknown events.
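The alternatives discussed above can be compared on a small example. The sketch below is only illustrative: the helper name `completion_report`, the use of `None` to mark atoms with unknown truth value, and the example probabilities are assumptions of this example rather than part of the described system.

```python
from typing import Dict, List, Optional


def completion_report(atoms: List[str],
                      prob: Dict[str, Optional[float]],
                      unknown_value: float = 0.0) -> dict:
    """Completion of a conjunction plus counts of false and unknown events.

    prob maps an atom to its probability of being true, or to None when
    neither evidence nor inference says anything about it.
    unknown_value=0.0 reproduces the CWA; 0.5 is the open-world variant.
    """
    values, n_false, n_unknown = [], 0, 0
    for atom in atoms:
        p = prob.get(atom)
        if p is None:
            n_unknown += 1
            p = unknown_value
        elif p == 0.0:
            n_false += 1
        values.append(p)
    return {"completion": sum(values) / len(values),
            "false": n_false, "unknown": n_unknown}


atoms = ["E1", "E2", "E3", "E4"]
probs = {"E1": 1.0, "E2": 0.0, "E3": None, "E4": 1.0}
print(completion_report(atoms, probs))                     # CWA
print(completion_report(atoms, probs, unknown_value=0.5))  # open world
```

Reporting the two counts next to the CWA completion lets the operator tell a conjunct that is known to be false apart from one that has simply not been observed yet, which a single number cannot convey.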

4. Knowledge representation and experiments 

The creation of a knowledge base implies the use of a
representation formalism to encode Subject Matter Expert (SME)
knowledge into formulas. The domain, that is, the part of the
world about which we want to express sentences, is represented
by a set of assertions, also called formulas, in FOL, which
guarantees a precise semantic characterization. In our
experiments we set the MLN weights, which represent the
uncertainty of each rule, manually, as sample data was not
available to learn the anomaly model in an unsupervised manner.
Learning weights and rules from data will be investigated in
future work. When a large amount of data is available, it is
preferable to train the MLN as described in Section 2.

We start by defining a knowledge base that models our domain by
describing entities and their relationships. We use a maximum
weight ω, proportional to the number of ground atoms [18], to
define the highest certainty about a rule, that is, a hard
constraint; fractions of ω can be assigned to formulas held
with less confidence [39].
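As a rough illustration of how rule confidences expressed as fractions of the maximum weight translate into numeric weights, consider the sketch below. The value chosen for the maximum weight, the helper names and the ASCII rule notation (`!` for negation, `=>` for implication) are assumptions of this example; the three fractions are those read from Table 3 for rules #7, #9 and #12.

```python
from fractions import Fraction


def rule_weight(fraction: Fraction, omega: float) -> float:
    """Numeric weight of a rule given as a fraction of the maximum weight."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must lie in (0, 1]")
    return float(fraction) * omega


def format_rule(formula: str, fraction: Fraction, omega: float) -> str:
    """Render a weighted rule as '<weight> <formula>'."""
    return f"{rule_weight(fraction, omega):g} {formula}"


omega = 20.0  # hypothetical maximum weight (e.g. one unit per ground atom)
rules = [
    ("!AIS(v) => alarm(v)", Fraction(1)),                       # rule #7, hard constraint
    ("humint(v, Smuggling) => suspicious(v)", Fraction(3, 5)),  # rule #9
    ("!suspicious(v) => !alarm(v)", Fraction(1, 5)),            # rule #12
]
for formula, fraction in rules:
    print(format_rule(formula, fraction, omega))
```

Keeping the fractions symbolic and multiplying by the maximum weight only at the end makes it easy to rescale all weights when the number of ground atoms changes.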

Contextual information is provided by a human operator, and 
observed evidence necessary to ground the MLN is simulated 
as already processed sensory data. It is important to distinguish 
the role of these two sources of information. A static entity and 
the associated resources or characteristics can be described a 
priori by a human operator; this knowledge can be updated in 
time when some of these features vary, but the entity is (almost) 
permanent in the domain. On the contrary, evidence about moving
or non-static objects is created on-the-fly when needed; it is
not permanent and can vary in time. For this reason, we must
distinguish between contextual evidence, that is, contextual
(static) information, and observed evidence, which refers to
sensory data regarding a specific vessel of interest at a
certain instant of time.

Two main scenarios will be examined, discussing the impact 
of contextual information injection. In the first case, a possible 
rendezvous (stow ’n’ go) between ships is depicted; maritime 
experts have been consulted to highlight common patterns and 
to create the knowledge base after the situation definition. In the 
second scenario, extended from [17], the threat is represented 
by a dangerous combination of material carried by cargo ships 
that arrive at adjacent berths simultaneously. 

To run experiments with Markov Logic Networks we used Alchemy³,
a library for statistical relational learning and probabilistic
logic inference based on the Markov logic representation,
integrated in a graphical interface provided by probCog⁴, which
makes it easy to code rules and set Alchemy’s parameters.

Alchemy offers the possibility to learn the weights associated
with the KB formulas from a set of sample patterns, in the form
of true/false atoms. As stated above, in our experiments the MLN
weights (Table 3 and Table 12) were instead set manually, since
no sample data was available.

MLN inference hinges on Markov network inference (#P-complete)
on one side and on FOL inference (NP-complete) on the other.
However, by properly exploiting the structure of the network,
MLN inference is in some cases more efficient than standard FOL
inference [18]. In our experiments, we ran approximate inference
using the Monte Carlo SAT (MC-SAT) algorithm with a maximum of
5000 steps.
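To make concrete what such an inference query computes, the sketch below evaluates a query exactly by enumerating all possible worlds of a tiny ground network, using the standard log-linear MLN distribution in which each world is weighted by the exponential of the summed weights of the ground formulas it satisfies. This is not MC-SAT and not the Alchemy implementation: brute-force enumeration is exponential in the number of ground atoms and is shown only because it is easy to follow. The maximum weight of 10, the restriction to a single vessel, and the selection of rules #7, #11 and #12 of Table 3 are assumptions of the example.

```python
import itertools
import math
from typing import Callable, Dict, List, Tuple

World = Dict[str, bool]
# A ground formula: its weight and a function telling whether a world satisfies it.
GroundFormula = Tuple[float, Callable[[World], bool]]


def implies(a: bool, b: bool) -> bool:
    return (not a) or b


def query_probability(query: str, atoms: List[str],
                      formulas: List[GroundFormula],
                      evidence: World) -> float:
    """P(query = True | evidence) by brute-force enumeration of worlds."""
    free = [a for a in atoms if a not in evidence]
    num = den = 0.0
    for values in itertools.product([False, True], repeat=len(free)):
        world = {**evidence, **dict(zip(free, values))}
        # Unnormalised weight: exp of the summed weights of satisfied formulas.
        weight = math.exp(sum(w for w, f in formulas if f(world)))
        den += weight
        if world[query]:
            num += weight
    return num / den


omega = 10.0  # hypothetical maximum weight
atoms = ["AIS", "suspicious", "alarm"]  # ground atoms for a single vessel
formulas = [
    (omega, lambda x: implies(not x["AIS"], x["alarm"])),                 # rule #7
    (omega, lambda x: implies(x["suspicious"], x["alarm"])),              # rule #11
    (omega / 5, lambda x: implies(not x["suspicious"], not x["alarm"])),  # rule #12
]

print(query_probability("alarm", atoms, formulas, {"AIS": False}))
print(query_probability("alarm", atoms, formulas, {"AIS": True}))
```

With the transponder off, the heavily weighted rule #7 pushes the alarm probability close to one; with the transponder on, the alarm depends only on the unobserved suspicious atom and the probability stays near one half.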



³ http://alchemy.cs.washington.edu/

⁴ http://wwwbeetz.informatik.tu-muenchen.de/probcog-wiki/index.php

Figure 5: Entities and relations of the proposed “rendezvous” example.



4.1. Rendezvous scenario 

The first scenario aims to detect a rendezvous act, that is, a
meeting of two vessels for trafficking or smuggling of people or
goods (drugs, food, oil, etc.). The two vessels usually have no
transponder system activated, in order to remain undetected, and
they commonly meet offshore, far away from the coast. A less
frequent case is when a small boat makes a rendezvous with the
smugglers’ mother ship, usually a cargo ship or another large
vessel.

The rendezvous represents a complex event and is a binary
relation, as it involves two objects of the same type (vessel).
The complete diagram of the entities involved in this scenario
is shown in Figure 5. The structure is vessel-centric, with the
main entity having attributes such as stopped, insideCorridor
and AIS to indicate that the ship has stopped, navigates inside
a virtual traffic corridor (allowed area), and has its AIS
transponder on. The stopped predicate implies a low-level
processing step in which the speed over ground (SOG) or the
position of the ship is monitored; likewise, AIS requires the
translation of a low-level signal into an event. The
insideCorridor predicate requires a prior definition of
navigable zones and normalcy traffic corridors (as in Figure 3),
but their modelling is outside the scope of this work (see for
example [23]).

To describe complex events, which often imply a temporal
sequence of facts, we employ Allen’s temporal logic [41].
Allen’s Interval Algebra provides a composition table for
reasoning about the relations that hold between temporal
intervals. Predicates such as overlaps and meets are binary
relations that indicate a temporal link between two vessels.

Another attribute represents HUMan INTelligence (HUMINT)
reports, which consist of additional information about
suspicious vessels. The other entity in the scenario is the
zone where the vessel is located, which can take the values
harbour, nearCoast, openSea or intWaters.

4.1.1. Knowledge base 

The knowledge base, whose rules are defined by a human
operator, describes the rendezvous procedure, how it can be
discovered and represented, and the anomalies that can be
derived from data. A subject matter expert can help to hand-code
the knowledge rules that are accepted for each domain, or the
models can be learned from data in a supervised manner.

The formula set consists of nineteen rules that describe normal
and abnormal behaviours for ships. The first rules (#1-#4)
describe the symmetry of some relationships, namely the temporal
ones, proximity and rendezvous. Then we build the anomaly
identifiers: a vessel is defined as suspicious if it is stopped
in international waters or open sea (rule #5), if it sails
outside traffic corridors (#8), or if there is a HUMINT
smuggling report on it (#9), while staying in a harbour or near
the coast is not considered an anomaly (#6). If the vessel has
the AIS transceiver system turned off, the alarm flag is raised
(#7) regardless of where the ship is located or what it is
doing. A suspicious vessel triggers an alarm (#11 and #12).

The concept of proximity is shaped by rules #13, #14 and #15:
a vessel can be located in only one zone at a time (#13), two
vessels that are not located in the same area cannot be close in
space (#14), and in this last case there can be no rendezvous
(#15). On the contrary, two vessels that are in the same area in
overlapping time intervals define a rendezvous anomaly (#17), a
rule that is stronger if they are flagged as suspicious (#16).
If one of the two has not stopped, the rendezvous is not
possible (#18), as is the case when the two vessels are in the
same area in non-overlapping time intervals (#19).

Table 3: Knowledge base for the rendezvous scenario in FOL with associated weights.

#    Weight   Rule
1    ω        overlaps(v, y) ⇔ overlaps(y, v)
2    ω        meets(v, y) ⇔ meets(y, v)
3    ω        proximity(v, y) ⇔ proximity(y, v)
4    ω        rendezvous(v, y) ⇔ rendezvous(y, v)
5    4/5 ω    stopped(v) ∧ (isIn(v, openSea) ∨ isIn(v, intWaters)) ⇒ suspicious(v)
6    2/5 ω    stopped(v) ∧ (isIn(v, harbour) ∨ isIn(v, nearCoast)) ⇒ ¬suspicious(v)
7    ω        ¬AIS(v) ⇒ alarm(v)
8    4/5 ω    ¬insideCorridor(v) ⇒ suspicious(v)
9    3/5 ω    humint(v, smuggling) ⇒ suspicious(v)
10   1/5 ω    humint(v, clear) ⇒ ¬suspicious(v)
11   ω        suspicious(v) ⇒ alarm(v)
12   1/5 ω    ¬suspicious(v) ⇒ ¬alarm(v)
13   ω        isIn(v, z) ⇒ (z ≠ zp) ∧ ¬isIn(v, zp)
14   ω        isIn(v, z) ∧ isIn(y, zp) ∧ (z ≠ zp) ⇒ ¬proximity(v, y)
15   ω        ¬proximity(v, y) ⇒ ¬rendezvous(v, y)
16   ω        suspicious(v) ∧ suspicious(y) ∧ (overlaps(v, y) ∨ meets(v, y)) ∧ proximity(v, y) ⇒ rendezvous(v, y)
17   1/5 ω    (overlaps(v, y) ∨ meets(v, y)) ∧ proximity(v, y) ⇒ rendezvous(v, y)
18   3/5 ω    ¬stopped(v) ∨ ¬stopped(y) ⇒ ¬rendezvous(v, y)
19   4/5 ω    before(v, y) ∧ proximity(v, y) ⇒ ¬rendezvous(v, y)

4.1.2. Contextual information 

Generally, context does not directly provide information on
the object of interest, but in this case the evidence assumes
meaning only when matched with contextual information. For
instance, the isIn predicate fuses observed evidence (the vessel
position) with a priori knowledge (the zones).

Thus, in our case, the zone subdivision and the traffic
corridors are additional information that can be exploited to
better refine the complex events. We know that rendezvous mostly
happen in open sea. Without the definition of zones, and thus
without context, we would not be able to formalize this rule.

Moreover, context is used here to strengthen the posterior
probability of a predicate; for instance, suspicious can be
grounded if the “vessel stopped at zone X”, if HUMINT provided a
smuggling report, or if the vessel is not transiting within
traffic corridors. Two of these situations are generated only
when sensory data is matched with context, and context actively
contributes to reinforcing the predicate. Contextual information
is presented in Table 4.

Table 4: Contextual information for the rendezvous scenario.

¬insideCorridor(V1)    isIn(V1, openSea)
¬insideCorridor(V2)    isIn(V2, openSea)
insideCorridor(V3)     isIn(V3, nearCoast)
insideCorridor(V4)     isIn(V4, intWaters)
insideCorridor(V5)     isIn(V5, intWaters)

Figure 6: Illustration of the rendezvous example: several vessels navigate at different distances from the coast. Some of them have the AIS transponder off
(symbolized by the red symbol) and for two of them intelligence reports have been provided (represented by the speaking head).



Table 5: Observed evidence in the rendezvous scenario.

Table 6: Anomalies in the rendezvous scenario; the rendezvous anomalous
event is taking place only between V1 and V2.

Table 8: Results for the rendezvous alarm in the first scenario with no contextual
information provided.

4.1.3. Results

As a concrete example, we imagine a situation in which V1 and
V2 are having a rendezvous in open sea; they have the AIS
transponder switched off and are close in space
(proximity(V1, V2)). On V1, intelligence sources provided a
smuggling report, indicating that historical data suggests the
vessel may be suspicious. In the meantime, a cruise ship V5 is
transiting in international waters and a big leisure craft V3
is moored near the coast. Another vessel V4, for which smuggling
reports have been provided as well, is stationary in
international waters with the AIS transponder not activated.
The scenario is illustrated in Figure 6. From sensory data
(Table 5), we observe that V1, V2, V3 and V4 are stationary,
thus the predicate stopped is true for them. Three of them do
not have an AIS transponder sending messages, and (V1, V2) and
(V4, V5) are close to each other.



Table 7: Results for the rendezvous alarm in the first scenario.

Table 9: Low-level anomalies for single vessels; for instance, an AIS
transceiver that is switched off can raise an alarm flag.


The system is asked to answer two queries,
$P(\mathit{alarm}(V_n) \mid M_{L,C})$ and
$P(\mathit{rendezvous}(V_n, V_m) \mid M_{L,C})$, which represent
the probability that the predicate alarm is true for a given
vessel $V_n$ and the probability that a rendezvous is happening
between vessels $V_n$ and $V_m$, respectively. $M_{L,C}$
represents the Markov network created by grounding the set of
formulas L given in Table 3, while C is the set of constants as
in Section 4.1.1. Contextual and sensory evidence are specified
in Tables 4 and 5, respectively. We tested the MLN with the same
evidence set but two different contexts: in the first case, all
the available information is provided, while in the second case
the contextual information is removed and only the sensory data
is maintained.

Ground truth values are shown in Tables 6 and 9 for
$P(\mathit{rendezvous}(V_n, V_m) \mid M_{L,C})$ and
$P(\mathit{alarm}(V_n) \mid M_{L,C})$, respectively.

The results presented in Table 10 indicate that, in the
presence of contextual information, the system correctly
classifies the anomaly of having the AIS turned off or being
suspicious enough to alert the operator and, as suggested by
Table 7, it also recognizes the rendezvous between V1 and V2
with a high probability. No other rendezvous is detected. When
contextual information is missing, the AIS anomaly, which
derives from sensory data, is detected with high confidence
(Table 11). By contrast, the rendezvous alert, which depends on
the traffic corridors and sea zones defined by context
(Table 4), is classified as an abnormal pattern with lower
confidence (Table 8).

Uncertainty rises for the pair (V4, V5) and for two single
unsuspicious vessels (V3 and V5) when no high-level information
is provided. This can be noticed in the detected anomalies for
ship pairs (Table 7), where the missing context suggests the
possibility of an open sea rendezvous, and in the single-vessel
alarm values (Table 10), which are no longer zero (or close to
zero). This is due to the fact that the reasoner, which relies
entirely on sensory data, assigns higher importance to the
proximity predicate. As we can see, context here does not have
a significant discriminative influence on results and rendezvous
detection; however, it helps by reducing uncertainty and
refining the event detection probability.

Table 10: Results for the single vessel alarm in the rendezvous scenario.

Table 11: Results for the single vessel alarm in the rendezvous scenario with
no contextual information provided.

Figure 7: Entities and relations of the proposed “hazmat” maritime example.




4.2. Hazmat scenario 

In this scenario several cargo ships head toward a harbour.
Some of the ships carry chemical or generic hazardous materials
(hazmat), for instance bleach and ammonia, that may cause a
severe threat when combined [17]. The ships are assigned berths
in a row, and each will be in the harbour before others or at
the same time.

The entities in our example are, as shown in Figure 7, cargo,
harbour, material and berth, which are linked together by the
fact that the cargo ship, carrying some hazardous material
(hazMat, which can be dangerous if combined with other sensitive
material), is heading (isHeadingTo) toward a certain harbour, in
which it has a berth. The predicate hasBerth takes a triplet of
harbour, vessel and berth as arguments to bind the three
classes. The berth has a predicate adjBerth, which is important
to indicate that two vessels are moored in adjacent berths, and
thus are neighbours.

Table 12: Knowledge base for the hazmat scenario in FOL with associated weights.

#    Weight   Rule
1    ω        overlaps(v, y) ⇔ overlaps(y, v)
2    ω        meets(v, y) ⇔ meets(y, v)
3    ω        neighbours(v, y) ⇔ neighbours(y, v)
4    ω        concurrent(v, y) ⇔ concurrent(y, v)
5    ω        dangerous(m1, m2) ⇔ dangerous(m2, m1)
6    ω        alarm(v, y) ⇔ alarm(y, v)
7    ω        meets(v, y) ∨ overlaps(v, y) ⇒ concurrent(v, y)
8    4/5 ω    ¬meets(v, y) ∧ ¬overlaps(v, y) ⇔ ¬concurrent(v, y)
9    ω        before(v, y) ⇒ ¬concurrent(v, y)
10   ω        ¬concurrent(v, y) ⇒ ¬alarm(v, y)
11   ω        cargo(v) ∧ isHeadingTo(v, h) ∧ harbour(h) ⇔ hasBerth(v, x, h) ∧ berth(x)
12   ω        cargo(v) ∧ cargo(y) ∧ hasBerth(v, x, h) ∧ hasBerth(y, z, h) ∧ adjBerth(x, z) ⇔ neighbours(v, y)
13   4/5 ω    ¬neighbours(v, y) ⇒ ¬alarm(v, y)
14   3/5 ω    cargo(v) ∧ cargo(y) ∧ hazMat(v, m1) ∧ hazMat(y, m2) ∧ ¬dangerous(m1, m2) ⇒ ¬alarm(v, y)
15   ω        cargo(v) ∧ cargo(y) ∧ hazMat(v, m1) ∧ hazMat(y, m2) ∧ neighbours(v, y) ∧ dangerous(m1, m2) ∧ concurrent(v, y) ⇒ alarm(v, y)

Instead of the seven original predicates (before(v1, v2),
meets(v1, v2), overlaps(v1, v2), starts(v1, v2), during(v1, v2),
finishes(v1, v2) and isEqualTo(v1, v2)), which define the time
of permanence at berths of the two ships v1 and v2, we shorten
the list to before(v1, v2), meets(v1, v2) and overlaps(v1, v2),
as these are the most frequent time relations between ships’
permanence times. In fact, a ship can leave a harbour before
another comes in, thus the two vessels do not meet.
Alternatively, it can stay moored for a long time, which
overlaps with other vessels’ permanence. One more case is
represented by the meeting event, which happens if a vessel
leaves just after another one arrives; this situation is
relevant as the cargo content may not be fully processed and may
still be placed on the berth, thus allowing interactions with
the contents of other ships. Other temporal definitions in our
domain can be considered special cases of the overlap relation.
These predicates, which are binary relations, are important as
they allow us to properly model the scenario timeline and the
causality between successive events.
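The three temporal relations retained for this scenario (before, meets, overlaps) can be derived from the time intervals that two ships spend at their berths. The following sketch shows one possible way to do so; the `Stay` type, the `tolerance` parameter used to decide when a departure and an arrival are close enough to count as a meeting, and the time units are assumptions of this example and are not taken from the paper.

```python
from typing import NamedTuple


class Stay(NamedTuple):
    """Time interval during which a ship occupies its berth."""
    arrival: float
    departure: float


def temporal_relation(a: Stay, b: Stay, tolerance: float = 0.0) -> str:
    """Classify two stays as 'before', 'meets' or 'overlaps'.

    The stays are first ordered by arrival time, so the result describes
    the earlier ship with respect to the later one. Every relation in
    which the two intervals share some time is reported as 'overlaps'.
    """
    first, second = sorted((a, b), key=lambda s: s.arrival)
    gap = second.arrival - first.departure
    if abs(gap) <= tolerance:
        return "meets"     # one ship leaves (almost) as the other arrives
    if gap > 0:
        return "before"    # the first ship has left before the other arrives
    return "overlaps"      # the two stays share some time


print(temporal_relation(Stay(0, 4), Stay(6, 10)))                     # before
print(temporal_relation(Stay(0, 4), Stay(4, 8)))                      # meets
print(temporal_relation(Stay(0, 5), Stay(3, 8)))                      # overlaps
print(temporal_relation(Stay(0, 4.25), Stay(4, 8), tolerance=0.5))    # meets
```

The tolerance reflects the remark that a meeting occurs when one vessel leaves just after another arrives, which in practice is a short overlap rather than an exact coincidence of instants.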



4.2.1. Knowledge base 

The domain knowledge can be formalized with the FOL formulas
described in Table 12, where the higher the weight, the more
confident the statement.

The first six rules (#1-#6) codify the symmetry among elements
and are useful to avoid sorting items; in this way relations
between ships Vx and Vy also hold vice versa.

Rule #7 states that two vessels which meet or overlap are
concurrent in time, simplifying the concept of “simultaneous”
or “operative/moored at the same time”. The opposite condition
(#8), or the case when one vessel arrives or leaves before the
others (#9), defines the absence of interaction with other ships
in the scenario (being not concurrent). If two vessels are not
concurrent, they do not represent a threat (#10).

Referring to spatial relationships, a cargo ship that is
heading toward a harbour will have a berth assigned (#11), and
two vessels in the same harbour will be neighbours only if they
occupy adjacent berths (#12). If two vessels are not neighbours,
they cannot generate an alarm (#13); the same holds if they
transport cargo materials that are not dangerous when combined
(#14).

We can then define the main threat by the rule for which two
neighbouring cargo ships carry hazmats which are potentially
dangerous if combined (#15). In this case the cargo ships occupy
adjacent berths and are moored in the harbour at the same time.

4.2.2. Contextual information 

Probabilistic knowledge must be integrated with explicit
contextual knowledge, as sensory data may not be enough to
represent and identify complex situations. A simple low-level
anomaly detector would not detect the aforementioned threat, as
two cargo ships which enter a harbour for commercial reasons,
even when carrying hazmats, raise no alarm. However, additional
information provided by context can help to identify a
suspicious event.

Context, described in detail in Table 13, is represented in
our scenario by the following scenario-dependent information:

• A harbour H1 has four berths B1, ..., B4, and some of the
berths are adjacent. The exact map of adjacent berths can
be provided by a human operator. In our case, we codify
the proximity with a set of symmetric rules. We suppose,
as shown in Figure 8, that berths B1 and B2 are adjacent, as
well as B2 and B3, and B3 and B4.

• Some materials, denoted by M, are dangerous or potentially
lethal if combined together. This information must
necessarily be provided by an SME, as it cannot be directly
inferred from the materials alone. In our example, we
suppose that (M1, M2), (M2, M3) and (M2, M4) are dangerous
combinations.

As we will see in the experiments, it is important that this
information be as complete as possible, to depict accurately
the scenario with its entities and relationships.

4.2.3. Results 

We aim to demonstrate how contextual information is a crucial
element in building an exhaustive and accurate situational
picture, which allows an anomaly to be detected in a timely
manner.

We imagine a situation like the one described in Figure 8 and
Table 14. V1 leaves the harbour prior to the arrival of V2 and
V4. After a while, V3 reaches its berth, and it remains there
when V5 arrives and moors. The fact that a ship is carrying
hazardous material and the type of material can be classified
as sensory data, as this information can be fetched on-the-fly
when the ship becomes a vessel-of-interest or when the system
registers the vessel. The time predicates can also be calculated
at runtime, by comparing the ETA (Estimated Time of Arrival) and
a minimum time of permanence needed to handle the ship content.

Figure 8: Illustration of the evolution of the hazmat scenario. Cargo ship V1 leaves much earlier than the arrival of V2 and V4 (a), and before V3 reaches its berth
(b). As V2 leaves, V5 arrives (c).

Table 13: Contextual information provided a priori for the “hazmat” scenario.
Apart from the harbour and its facilities, the description of dangerous combinations
of materials is provided.

harbour(H1)
berth(B1, H1)
berth(B2, H1)
berth(B3, H1)
berth(B4, H1)
adjBerth(B1, B2)
adjBerth(B2, B3)
adjBerth(B3, B4)
¬adjBerth(B1, B3)
¬adjBerth(B1, B4)
¬adjBerth(B2, B4)
dangerous(M1, M2)
dangerous(M2, M3)
dangerous(M2, M4)
¬dangerous(M1, M4)
¬dangerous(M2, M4)

All the cargo ships in our scenario transport hazardous
material, but from a priori information (Table 13) we know that
the dangerous combinations are (M1, M2), (M2, M3) and (M2, M4).

The query $P(\mathit{alarm}(V_n, V_m) \mid M_{L,C})$ represents
the probability that the predicate alarm is true for a given
vessel pair $(V_n, V_m)$, where $M_{L,C}$ is the Markov network
created by grounding the set of formulas L shown in Table 12, C
is the set of constants as defined in Section 4.2.1, and
contextual and sensory evidence are provided according to
Tables 13 and 14, respectively. For the sake of clarity and
completeness, here we consider all the possible combinations of
vessels $(V_n, V_m)$; however, to speed up the application of
the proposed techniques in a real-world environment, only
vessels that will occupy adjacent berths could be selected for
a check.

Table 15 shows the possible risky combinations of hazardous
materials carried by cargo ships that occupy adjacent berths and
are moored in the harbour at the same time. Threats are
highlighted in red and marked with “Y”, while a normal situation
is coloured white, marked with “N”, and should raise no alarm
flag. Diagonal terms give no anomaly. Hazardous material M1 is
considered dangerous when combined with others, but as the cargo
ship which carries it leaves before the others, no alarm is
raised. Materials that are brought to non-adjacent berths do not
constitute a dangerous combination, thus the pair (V4, V5) does
not constitute a threat.

Table 14: Observed facts (evidence) in the “hazmat” scenario. The time of
permanence of a cargo ship at its berth is calculated only with respect to
neighbouring cargo ships.

Table 15: Dangerous combinations of hazardous materials carried by cargo
ships that share adjacent berths are highlighted in red and marked with “Y”.

The results for this scenario are presented in Tables 16 and
17. We performed two tests with the MLN: in both cases the
evidence set is the same, but the contextual information is
completely missing in the second experiment. We tested all the
possible vessel combinations and, when contextual information is
provided, the reasoner sets an alarm in the case of (V2, V3) and
(V3, V5), matching the ground truth (Table 15). By contrast, no
alarm is raised when context is missing, as the values for
suspicious cargo ships are low.

Table 16: Results for alarm raising in the “hazmat” scenario.

Table 17: Results for the “hazmat” scenario without contextual information.

Figure 9: Example of level of completion calculation for the alarm complex event in formula #15 in Table 12.

Figure 10: Example of level of completion calculation for the rendezvous complex event of formulas #16 and #17 in Table 3.

4.3. Completion level of complex event example 

Two examples of completion calculation are provided in the 
following. Figure 9 shows the first case, where formula #15 in 
Table 12 has been grounded according to observed evidence. 
The antecedent of the formula is composed entirely of a 
conjunction of predicates. The literal neighbours(V1, V2) 
corresponds to a complex event whose components are expanded on 
the left. All the component simple events are direct observations 
or contextual knowledge that, in the example, turned out to be 
true with complete certainty. The remaining literals in formula 
#15 are direct evidence or contextual knowledge, among which the 
predicate concurrently(V1, V2) has been evaluated as false. The 
level of completion of this formula is simply calculated as the 
average of all probabilities, yielding 85%, which indicates that 
the alarm complex event is close to fully occurring. However, 
as can be seen in Table 16, the probability associated with the 
event by the Markov network is 0.05. The two results together 
provide useful information to the operator: the Markov network 
correctly does not consider V1 and V2 to be a current threat, 
thus avoiding unnecessary false alarms, while the level of 
completion can be used to monitor how events are evolving in the 
scenario. In this case, it highlights a dangerous condition that 
is close to fully occurring. The level of completion could also 
provide useful information for a posteriori forensic analysis. 

A second example of calculation is shown in Figure 10, which 
illustrates the grounding of formulas #16 and #17 in Table 3. The 
example shows the calculation when multiple formulas refer to 
the same complex event. In this case, formula #17 defines a 
low-weight condition for the rendezvous complex event, defined 
as two vessels being in the same area in overlapping time 
intervals. Formula #16 provides a much stronger condition by 
also considering HUMINT information on the two vessels. The 
level of completion of a complex event that appears as the 
consequent of two or more formulas is calculated by applying the 
rules described above in this section to the disjunction of the 
antecedents of the formulas. Figure 10 shows how, given accrued 
evidence on vessels V4 and V5, a rendezvous event has actually 
occurred. Specifically, formula #17 is complete while #16 is at 
83%. The output probability of the rendezvous event for the two 
vessels is 0.01, as can be seen in Table 6. This means that the 
event has occurred but is not considered dangerous (i.e. one of 
the two vessels has not stopped and is not considered suspicious 
by HUMINT reports). 
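
The two calculations above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names And, Or and completion are hypothetical, a nested complex event is assumed to contribute its own completion value as a single term of the average, and the disjunction of several antecedents is assumed to take the most complete alternative (consistent with the rendezvous event counting as occurred once formula #17 is complete). The probabilities used in the usage note are illustrative and are not read from Figures 9 or 10.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class And:
    """Conjunction of simple events (probabilities) or nested complex events."""
    parts: List["Node"]


@dataclass
class Or:
    """Disjunction of alternative antecedents defining the same complex event."""
    parts: List["Node"]


Node = Union[float, And, Or]


def completion(node: Node) -> float:
    """Return the level of completion of an event expression, in [0, 1]."""
    if isinstance(node, (int, float)):
        # Leaf: probability of a simple event (evidence or contextual knowledge).
        return float(node)
    values = [completion(part) for part in node.parts]
    if isinstance(node, And):
        # Conjunction: average of the component probabilities, as in the text.
        return sum(values) / len(values)
    # Disjunction: ASSUMPTION, take the most complete alternative.
    return max(values)
```

For instance, completion(And([And([1.0, 1.0]), 1.0, 0.0, 1.0])) evaluates a conjunctive antecedent with one nested complex event and one false literal, while completion(Or([And([1.0, 1.0]), And([1.0, 1.0, 0.5])])) evaluates an event defined by two formulas, one of which is fully satisfied.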

4.4. Discussion 

Analysing the results of the previous experimental section, 
we can state that MLNs constitute an improvement and a 
promising addition to the maritime situational awareness litera- 
ture. 

First of all, output probabilities can be associated with simple 
and (unobserved) complex events and, more generally, with every 
predicate in the knowledge base. Together with the level of 
completion, this provides important information, as it allows 
the operator to assess the level of risk associated with the 
events and their percentage of completion for further analysis. 

All FOL-based approaches are problematic in the case of an 
inconsistent knowledge base, because a single inconsistency 
makes the entire knowledge base inconsistent, and anything can 
then be derived from it. Many paraconsistent logics have been 
proposed to allow for local inconsistency without global 
inconsistency; a survey can be found in [42]. However, these 
techniques generally tackle the problem by weakening classical 
logic (e.g. negation) and thus lose expressiveness. In MLNs, 
unlike in most frameworks (fuzzy logic, DLs, basic expert 
systems, etc.), loops, contradictions and inconsistencies between 
rules are handled automatically by weighing the evidence 
supporting the formulas [18] as per (3). This means that the 
framework can be used even in the presence of an inconsistent 
KB, as would likely be the case when merging knowledge from 
multiple sources or collecting it from different experts. 
Probabilities are also involved in the construction of the 
knowledge base, as they are associated with each formula: 
contradictory or subjective expert formulations of the problem 
can be given an “a priori” uncertainty, which represents the 
degree of confidence the experts assign to the relational 
knowledge among objects and entities in the scenario. 
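
A minimal worked case may help to see why contradictory formulas do not break the model. It is only a sketch, assuming that (3) is the standard MLN log-linear distribution, in which each possible world x has probability proportional to the exponential of the summed weights of the formulas it satisfies. The single ground atom A and the weights w_1, w_2 are hypothetical. Suppose one expert asserts A with weight w_1 and another asserts ¬A with weight w_2. The world where A is true satisfies only the first formula and the world where A is false satisfies only the second, so:

```latex
P(A = \text{true})
  = \frac{e^{w_1}}{e^{w_1} + e^{w_2}}
  = \frac{1}{1 + e^{-(w_1 - w_2)}}
```

In classical logic the pair {A, ¬A} would entail everything; here the two statements simply compete, and the more heavily weighted one dominates in proportion to the weight difference. With w_1 = w_2 the result is 0.5, i.e. the contradiction cancels out.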

Moreover, unlike expert systems or rigid if-then-else rules, 
which require all the evidence to be provided simultaneously to 
make a decision, the chosen framework and the definition of 
completion of events are developed here to satisfy the 
operator’s need to detect potentially anomalous events while 
they are still in progress, even in the presence of incomplete 
evidence. The maritime domain is well suited to showing how data 
and information coming from heterogeneous sources can be fused 
together within the MLN framework; in fact, the domain is 
characterized by events that typically span a considerable 
amount of time and that slowly evolve as new information is 
acquired while they are still happening. This implies that the 
system should work with partial or missing observations without 
being jeopardized by them. This aspect is particularly valuable 
for timely anomaly and threat prediction, as the status of the 
situation can be anticipated and updated as new information is 
acquired. 

Another advantage over state-of-the-art systems is the 
possibility to query an arbitrary formula and obtain 
probabilities as output. For instance, the operator may want to 
build a custom query on the fly and test it in real time. In the 
rendezvous example, a query on an indirect predicate such as 
suspicious(v), with v = [V1, ..., V5], returns the probability 
of each vessel being suspicious given the evidence, while more 
complex formulas, such as 
[(suspicious(x) ∨ suspicious(y)) ∧ proximity(x, y)], allow a 
closer investigation of the situation for pairs of vessels. 
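
The idea of querying an arbitrary formula can be illustrated with a toy ground network over three atoms. This is a sketch only: the atom names, the four weighted formulas and their weights are hypothetical and are not the rules of Table 3, and the exhaustive enumeration of possible worlds used here is feasible only for very small examples (practical MLN engines rely on approximate inference instead).

```python
import itertools
import math

# Hypothetical ground atoms for one pair of vessels.
ATOMS = ["suspicious_V4", "suspicious_V5", "proximity_V4_V5"]

# Hypothetical weighted ground formulas: (weight, truth function over a world).
FORMULAS = [
    # proximity(V4, V5) => suspicious(V4)
    (1.5, lambda w: (not w["proximity_V4_V5"]) or w["suspicious_V4"]),
    # proximity(V4, V5) => suspicious(V5)
    (1.5, lambda w: (not w["proximity_V4_V5"]) or w["suspicious_V5"]),
    # Weak prior: vessels are usually not suspicious.
    (1.0, lambda w: not w["suspicious_V4"]),
    (1.0, lambda w: not w["suspicious_V5"]),
]


def query(formula, evidence):
    """Return P(formula | evidence) by enumerating all possible worlds."""
    numerator = 0.0
    normaliser = 0.0
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        world = dict(zip(ATOMS, values))
        # Skip worlds that contradict the observed evidence.
        if any(world[atom] != value for atom, value in evidence.items()):
            continue
        # Unnormalised score: exp of the summed weights of satisfied formulas.
        score = math.exp(sum(wt for wt, f in FORMULAS if f(world)))
        normaliser += score
        if formula(world):
            numerator += score
    return numerator / normaliser


def pair_query(w):
    """(suspicious(V4) v suspicious(V5)) ^ proximity(V4, V5)."""
    return (w["suspicious_V4"] or w["suspicious_V5"]) and w["proximity_V4_V5"]


p = query(pair_query, {"proximity_V4_V5": True})
print(f"P(query | evidence) = {p:.3f}")
```

Because the query is an ordinary Boolean function of a world, any formula the operator can write over the ground atoms can be passed to query in the same way, including a single predicate such as suspicious(V4).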

Finally, the experimental results highlight the importance of 
context as a key element in a real-world domain for timely, 
complete and accurate situation assessment. In the absence of 
context, anomaly detection performance decreases dramatically; 
therefore, we can say that context provides the discriminative 
power necessary to achieve correct inferences. 

5. Conclusions 

Events and anomalies are fundamental concepts for identifying 
and comprehending critical occurrences in the observed 
environment. Proper event detection and understanding facilitate 
situation assessment and human decision making, with 
applications to safety, security, consequence management and 
recovery, among others. In this paper, we examined these 
concepts in the maritime domain within the scope of the JDL 
fusion model terminology. 

Furthermore, we employed Markov Logic Networks as an efficient 
and robust tool that leverages both the expressive power of FOL 
and the probabilistic uncertainty management of Markov networks. 
This allowed us to encode uncertain a priori and contextual 
knowledge, fuse data coming from multiple heterogeneous sources 
of information and perform reasoning on incomplete data. In 
particular, observed and contextual evidence is fed into the 
inference engine, and reasoning is thus performed by combining 
evidence from low-level data and high-level information. The 
latter, represented in our case by logic formulas that refer to 
domain entities and their relationships, is a key element in 
reducing uncertainty and achieving a more complete and accurate 
situational awareness picture. The results confirm this 
rationale and encourage further development on more complex 
scenarios. 

We have also provided a mechanism for early event detection 
by evaluating the level of completion of complex events. This 
could be useful to provide early warnings before hazardous con- 
ditions actually take place. 